Gabriel Turinici
The convergence of the Stochastic Gradient Descent (SGD): a self-contained proof
Stochastic Gradient Descent (SGD) and the algorithms derived from it are used extensively in Deep Learning, a branch of Machine Learning, but a proof of convergence is not always easy to find. The goal of this paper is to adapt various proofs from the literature into a simple format. In particular, no claim of originality is made (see [1-4] for some of my recent research papers in this area); nevertheless, please cite this work if you find it useful (arXiv or DOI: 10.5281/zenodo.4638695). This proof can be used in any domain where a self-contained presentation is needed.